Best Practices for 
Open Sound Control 

Linux Audio Conference 2010 / Utrecht NL 

Andrew Schmeder (presenting), 

Adrian Freed, 

David Wessel 

Email Authors: 

{andy,adrian,wessel}@cnmat.berkeley.edu

CNMAT / UC Berkeley 
http://cnmat.berkeley.edu/ 




Monday, May 3, 2010 






Overview: 

- What is Open Sound Control? 

- What does OSC practice include? 

- Definition of audio control data, examples. 

- Temporal quality assurance. 

- Transport layer considerations. 

- Description strategies for control data. 

- Programming for audio control. 

What is Open Sound Control?

Section 1

What is OSC?


- Open Sound Control (OSC) is a content format 
for messaging among computers, sound 
synthesizers, and other multimedia devices that 
are optimized for modern networking technology. 

- Wikipedia.org 

What is OSC?


- A collection of ideas and practice for realtime audio 
control. Based around a descriptive document of 
the format and code (the "OSC-Kit") published by 
Matt Wright at CNMAT circa 2002. 

- Now, lots of diverse implementations in applications 
and embedded software 

- OSC, which is pronounced "oh-ess-cee", or 
sometimes "osk", stands for Open Sound Control. 

- Actually not going to be called "Open Show Control" 
as of April 1st 2010. 

What is Open?

- No license requirements 

- No patented algorithms 

- No conformance certification 

- No strict specification of 
requirements 

- Lots of open source code 
available 

What is not Open?

- Design: at the whim of its benevolent dictators 
- Acceptance criteria for a new idea: 

- Appropriate to the scope of OSC definition 

- Established need 

- Can be used in closed-source products, 
provided the implementation has a compatible 
license. 

OSC is a Content Format


OSC is not a Standard: 

- no conformance certification. 

OSC is not a Protocol: 

- no convention for detection, 
negotiation, error handling 

OSC is a Content Format: 

- a content format is a structured 
container of primitive data types. 




OSC Primitive Types 

- strings (human readable) 

- numbers: int32, IEEE 754 float single 

- optional types: int64, double, etc 

- "blobs" (byte arrays) 

- time 

- 't' typetag as NTP time, 

- pair of uint32 {seconds, seconds fraction} 
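
As a rough illustration of how these primitive types are laid out, the following Python sketch packs a message using the OSC 1.0 encoding rules (big-endian numbers, null-terminated strings padded to a multiple of four bytes). The helper names `osc_string` and `osc_message` are hypothetical and not taken from any particular OSC library; only the int32, float32, string and blob types are handled.

```python
import struct


def osc_string(text: str) -> bytes:
    """Null-terminate an ASCII string and pad it to a multiple of 4 bytes."""
    raw = text.encode("ascii")
    return raw + b"\x00" * (4 - len(raw) % 4)


def osc_message(address: str, *args) -> bytes:
    """Encode an address pattern plus int32/float32/string/blob arguments."""
    tags = ","
    payload = b""
    for arg in args:
        if isinstance(arg, bool):
            raise TypeError("bool is not handled in this sketch")
        if isinstance(arg, int):
            tags += "i"
            payload += struct.pack(">i", arg)      # int32, big-endian
        elif isinstance(arg, float):
            tags += "f"
            payload += struct.pack(">f", arg)      # IEEE 754 single
        elif isinstance(arg, str):
            tags += "s"
            payload += osc_string(arg)
        elif isinstance(arg, bytes):
            tags += "b"                            # blob: size, data, padding
            payload += struct.pack(">i", len(arg)) + arg + b"\x00" * (-len(arg) % 4)
        else:
            raise TypeError(f"unsupported argument type: {type(arg).__name__}")
    return osc_string(address) + osc_string(tags) + payload


message = osc_message("/channel/3/gain", 0.5)
print(len(message), message)
```

Every field is aligned to four bytes, so the total message length is always a multiple of four.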

OSC Structure

OSC Bundle:

- Bundle Identifier: #bundle

- NTP Timestamp: {Seconds, Seconds Fraction}

- Encapsulated Message(s) (...)
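
The bundle layout can be sketched in Python as follows, assuming the OSC 1.0 convention that each encapsulated element is preceded by its int32 size. The names `ntp_timetag` and `osc_bundle` are hypothetical helpers; the conversion assumes a non-negative Unix time in seconds and the standard offset of 2208988800 seconds between the NTP epoch (1900) and the Unix epoch (1970).

```python
import struct

NTP_UNIX_OFFSET = 2208988800  # seconds from 1900-01-01 to 1970-01-01


def ntp_timetag(unix_seconds: float) -> bytes:
    """Pack a Unix time as a pair of uint32 {seconds, seconds fraction}."""
    whole = int(unix_seconds)
    fraction = int((unix_seconds - whole) * 2**32)
    return struct.pack(">II", whole + NTP_UNIX_OFFSET, fraction)


def osc_bundle(timetag: bytes, *elements: bytes) -> bytes:
    """Concatenate the identifier, the timestamp and size-prefixed elements."""
    out = b"#bundle\x00" + timetag
    for element in elements:
        out += struct.pack(">i", len(element)) + element
    return out


# A pre-encoded message with address "/a" and no arguments (8 bytes).
element = b"/a\x00\x00,\x00\x00\x00"
bundle = osc_bundle(ntp_timetag(0.5), element)
print(len(bundle), bundle)
```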

OSC practice

OSI Layer      OSI Layer #   Topic
Application    7             Control Semantics, Choreography
Presentation   6             OSC Structure
Session        5             Discovery, Enumeration, Authentication
Transport      4             Quality of Service
Network        3             Bandwidth Reservation
Frame          2             Clock Synchronization
Hardware       1             Cabling, Wireless, Power



























Definition for Audio Control Data and Examples

Section 2

Audio Control Data


- Any time-based information 
related to an audio stream 
other than the audio 
component 

- Non-time-based audio-related information
consists of static stream properties

- You can use OSC for this,
but it's not "intended"

Properties of Audio Control

- Temporal errors can produce audible side-effects:

- "zipper" noise 

- spatialization aliasing errors 

- low quality interactivity (boring instruments) 

- Variable sample rates, mixed rates 

- Audio systems may have sensitive or high-power
hardware components needing robust control

Examples

- Instrumental gesture data 

- Spatial auditory scene parameters 

- Spatial rendering engine control 

- Audio synthesis engine control 

Instrumental Gestures


- Constrained by limits of mechanical and neural 
human body dynamics 

- With training, gestures can be repeatable with 
very high precision in time and space 

- Delay tolerance in performance: 20 msec round-trip
delay (Chafe et al.)

- Temporal repeatability: ~10 Hz continuous
motion, 10 msec accuracy, 1 msec precision,
1000 Hz SR (Wessel)

Information Rate

$I = \log_2\left(1 + \frac{S}{N}\right)$ bits

- Essentially, number of bits that change per second. 

- Is the fundamental determinant in Fitts's law

- ISO 9241-9 (measurement of information transfer
rate in target selection, mouse = ~3 bits/sec)

- Instrumental gestures are ~100 bits/sec in the time
dimension alone. IR in space/force to be determined.

Spatial Scene Parameters

[Figure labels: Source Width, Ensemble Width, Room Width, Scene]

Auditory Spatial Schemata (Gary Kendall)

- Source location, width, directivity 

- Diffusion from enclosing geometry (rooms) 

Sub-audible frequency band (0-40hz) 

Spatial Rendering Engine

- Examples include driving a distributed array with
Ambisonic/Wave-Field-Synthesis filter coefficients:

- Temporal error is equivalent to transducer
positioning error

- Temporal sync within 5% of sample frame at
max controlled frequency

- 500 nanoseconds (nsec) at 96,000 Hz

- AES2003-11 Best Practices for Network Audio 
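
The tolerance quoted above can be checked with a short calculation. This Python sketch reads "within 5% of sample frame" literally as 5% of one sample period; the function name `sync_tolerance_s` is hypothetical.

```python
def sync_tolerance_s(sample_rate_hz: float, fraction: float = 0.05) -> float:
    """Allowed clock error: a fraction of one sample period, in seconds."""
    return fraction / sample_rate_hz


for rate in (44100, 48000, 96000):
    # Report in nanoseconds for readability.
    print(rate, "Hz ->", round(sync_tolerance_s(rate) * 1e9), "nsec")
```

At 96,000 Hz one sample period is about 10.4 microseconds, so 5% of it is roughly 520 nanoseconds.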

Audio Synthesis Engine

- Data-driven analysis and synthesis algorithms

- Granular, concatenative, additive synthesis,
large filter banks

- Can have very high bandwidth: thousands of
entities per second

- Sub-sample accuracy (500 nsec is good enough)

- float32 is good enough for 500 nsec accuracy
(but only just barely).
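
One way to see why float32 is only barely sufficient is to look at the spacing between adjacent representable values when a time in seconds is stored as a single-precision float. The sketch below assumes that representation (time in seconds, positive values); `float32_spacing` is a hypothetical helper built on the standard `struct` module.

```python
import struct


def float32_spacing(x: float) -> float:
    """Gap between positive x (rounded to float32) and the next float32 above it."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    current = struct.unpack(">f", struct.pack(">I", bits))[0]
    following = struct.unpack(">f", struct.pack(">I", bits + 1))[0]
    return following - current


for seconds in (1.0, 4.0, 8.0, 60.0):
    print(seconds, "s ->", float32_spacing(seconds) * 1e9, "nsec resolution")
```

Under this assumption the resolution stays finer than 500 nsec only for time values below 8 seconds, which suggests using relative offsets or a wider type for longer time spans.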

Temporal Quality Assurance

Section 3

Temporal Quality Assurance

- Bounds on various delay properties: 

- maximum, minimum 

- variance. 

- accuracy and precision of scheduling 

Interrupt Service Jitter

- Hardware or software gets some data and raises
an interrupt service request.

- Data goes into a buffer until the interrupt is 
serviced by the system scheduler, then it gets 
delivered 

- Interrupt servicing has delay distribution of a 
random wait-time queue 

Random delay from buffered I/O

[Figure: typical ISR variable delay, 3-10 msec]

[Figure: 10 Hz signal]

[Figure: 10 Hz signal under 1-3 msec variable delay]

[Figure: jitter-induced noise on 10 Hz carrier (24 dB => ~4 bits resolution)]

Forward Synchronization Scheduler
(implementation is a priority queue)
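
A minimal sketch of such a scheduler, assuming each incoming message carries a source timestamp and is released after a fixed forward offset chosen to exceed the worst-case transport delay. The class name `ForwardSyncScheduler`, its methods and the `latency_s` parameter are illustrative choices, not an existing API.

```python
import heapq


class ForwardSyncScheduler:
    """Hold timestamped messages and release them at timestamp + latency."""

    def __init__(self, latency_s: float):
        self.latency_s = latency_s
        self._heap = []      # entries: (release_time, sequence, message)
        self._sequence = 0   # tie-breaker so equal times keep arrival order

    def push(self, timestamp_s: float, message) -> None:
        release = timestamp_s + self.latency_s
        heapq.heappush(self._heap, (release, self._sequence, message))
        self._sequence += 1

    def pop_due(self, now_s: float) -> list:
        """Return (release_time, message) pairs that are due, in time order."""
        due = []
        while self._heap and self._heap[0][0] <= now_s:
            release, _, message = heapq.heappop(self._heap)
            due.append((release, message))
        return due


scheduler = ForwardSyncScheduler(latency_s=0.010)
scheduler.push(0.002, "second gesture sample")   # arrives first, out of order
scheduler.push(0.001, "first gesture sample")
print(scheduler.pop_due(now_s=0.0125))
```

Messages come out ordered by their source timestamps regardless of arrival order, so the variable transport delay is traded for a constant, known latency.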

Carrier      Jitter magnitude
frequency    0.01 msec   0.1 msec   1 msec    2 msec    4 msec
0.5 Hz       100.806     80.942     60.5853   54.4588   48.2834
1 Hz         89.4672     69.2973    49.5129   42.7719   37.1899
2 Hz         83.5256     64.1865    44.4936   37.811    32.166
4 Hz         77.8606     58.3905    38.2024   32.4498   25.4497
8 Hz         72.3401     52.0053    31.2989   25.7653   20.1786
16 Hz        66.1133     45.8497    25.8291   19.7408   14.3312
32 Hz        60.2471     39.6844    19.7202   13.546    8.26448
64 Hz        53.9285     33.8882    13.9203   7.90135   1.7457

Carrier frequency vs jitter magnitude (values in dB),
BOLD => less than 8 bits headroom
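
The relationship between carrier frequency, jitter and signal-to-noise ratio can be explored with a small simulation. This Python sketch is not the method used to produce the table. It assumes a unit sinusoid, a zero-mean uniform timing error of total width `jitter_s`, and the small-angle approximation SNR = -20 log10(2 pi f sigma) with sigma = jitter_s / sqrt(12). All function names are hypothetical.

```python
import math
import random


def jitter_snr_db(freq_hz, jitter_s, n=20000, seed=1):
    """Estimate SNR (dB) of a sinusoid sampled with random timing error."""
    rng = random.Random(seed)
    signal_power = 0.0
    error_power = 0.0
    for _ in range(n):
        t = rng.uniform(0.0, 10.0)                       # random sample instant
        d = rng.uniform(-jitter_s / 2, jitter_s / 2)     # zero-mean jitter
        ideal = math.sin(2 * math.pi * freq_hz * t)
        jittered = math.sin(2 * math.pi * freq_hz * (t - d))
        signal_power += ideal * ideal
        error_power += (jittered - ideal) ** 2
    return 10 * math.log10(signal_power / error_power)


def jitter_snr_theory_db(freq_hz, jitter_s):
    """Small-angle approximation for uniform jitter of total width jitter_s."""
    sigma = jitter_s / math.sqrt(12)
    return -20 * math.log10(2 * math.pi * freq_hz * sigma)


for f in (1, 10, 40):
    print(f, "Hz:", round(jitter_snr_db(f, 0.001), 1), "dB simulated,",
          round(jitter_snr_theory_db(f, 0.001), 1), "dB approximated")
```

Both estimates fall by about 6 dB (one bit) for each doubling of either the carrier frequency or the jitter width.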

Summary of Jitter

- For typical control frequencies in the sub-audible 
bandwidth 0-40hz, typical transport jitter of a 
few milliseconds is unacceptable 

- Best effort is not good enough. 

- Forward sync scheduling can remove some jitter 
problems (maybe to 0.1 msec) 

- For audio apps a better solution is to synchronize 
physical-time with sample-time 

- Using a DLL filter, interpolation strategies etc. 
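
As an illustration of the DLL idea, the following Python sketch smooths noisy periodic timestamps (for example, audio callback times) with a common second-order delay-locked loop. The class name, the 1 Hz loop bandwidth and the simulated +/-1 msec jitter are illustrative assumptions rather than values from the slides.

```python
import math
import random


class DelayLockedLoop:
    """Second-order DLL that filters jittery timestamps of a periodic event."""

    def __init__(self, period_s, bandwidth_hz, first_timestamp_s):
        omega = 2.0 * math.pi * bandwidth_hz * period_s
        self.b = math.sqrt(2.0) * omega          # proportional gain
        self.c = omega * omega                   # integral gain
        self.period = period_s                   # filtered period estimate
        self.t0 = first_timestamp_s              # filtered time of current event
        self.t1 = first_timestamp_s + period_s   # predicted time of next event

    def update(self, timestamp_s):
        """Feed one raw timestamp; return the filtered time of that event."""
        error = timestamp_s - self.t1
        self.t0 = self.t1
        self.t1 += self.b * error + self.period
        self.period += self.c * error
        return self.t0


rng = random.Random(0)
period = 0.01
raw = [k * period + rng.uniform(-0.001, 0.001) for k in range(3000)]
dll = DelayLockedLoop(period, bandwidth_hz=1.0, first_timestamp_s=raw[0])
filtered = [dll.update(t) for t in raw[1:]]   # filtered[k - 1] estimates k * period
print("estimated period:", dll.period)
```

The filtered times can then be used to map between physical time and sample time with far less jitter than the raw timestamps.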

Transport Considerations

Section 4

Transport Topics


- Types of transports and their properties, e.g. 

- UDP 

- TCP 

- Serial (USB, RS232, file pointers) 

Ethernet AVB


- A solution for the endpoint-discovery and 
connection management problem (Bonjour/mDNS 
+ AVBC) 

- Uses bandwidth reservation protocols to ensure 
network availability with bounded delay (2msec, 
Class A) 

- Solves clock synchronization problem at the
Ethernet frame layer (500 nsec per AES2003-11)

- OSC can be sent over AVB streams using a MIME 
type identifier (1722.1 working group) 

Describing Audio Control Data

Section 5

Describing control data

- Four design patterns for describing a 
software interface: 

- RPC: Remote Procedure Call 

- REST: Representational State Transfer 

- OOP: Object Oriented Programming 

- RDF: Resource Description Framework 

By Example...


Channel #3 

RPC

Remote Procedure Call 


/setgain (channel number = 3) (gain value = x) 


Functional, reference-oriented semantics 
Good for highly-dynamic data structures 

(granular synth) 

REST

REpresentational State Transfer 


/channel/3/gain (x) 


Emphasis on enumeration of resources 
Encourages stateless protocols 

OOP

Object Oriented Programming 


/channel/3@gain (x) 
/channel/3/setgain (x) 

=> attribute (after XPath) 

RDF

Resource Description Framework


/channel,num,3/op,is,set/lvalue,is,gain (x) 


Unordered set of semantic triples of 
{subject, predicate,object} 
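
To compare the four styles side by side, this Python sketch builds the address and argument list for the same "set the gain of a channel" operation in each style, following the address shapes shown on the preceding slides. The function name `describe_gain` and the second OOP variant key are hypothetical.

```python
def describe_gain(channel: int, gain: float) -> dict:
    """Return (address, arguments) for one gain change in each style."""
    return {
        # RPC: the channel number is an argument of a procedure call.
        "RPC": ("/setgain", [channel, gain]),
        # REST: the channel and its gain are enumerated resources.
        "REST": (f"/channel/{channel}/gain", [gain]),
        # OOP: gain is an attribute (or a setter method) of a channel object.
        "OOP": (f"/channel/{channel}@gain", [gain]),
        "OOP (method)": (f"/channel/{channel}/setgain", [gain]),
        # RDF: a set of {subject, predicate, object} triples.
        "RDF": (f"/channel,num,{channel}/op,is,set/lvalue,is,gain", [gain]),
    }


for style, (address, arguments) in describe_gain(3, 0.5).items():
    print(f"{style:13s} {address} {arguments}")
```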

Programming Control Data

Section 5.3 - 5.4

Stateless Interfaces


- A stateful protocol is one where the 
meaning of a message has some 
dependence on a previously 
transmitted message. 

Example

Stateful Encoding

/button +1 
/button -1 
/button +1 
/button -1 

Error Robustness

/button +1
/button +1 ?
...
/button -1

Robust State Recovery

Parser Complexity is 2x!

Stateless Encoding (REST)

/button 0
/button 1 ?
...
/button 0
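
The difference in error robustness can be illustrated with a small Python sketch in which one message is lost in transit. It assumes the stateful encoding sends +1/-1 changes to the button state and the stateless encoding sends the absolute state (0 or 1). All function names are hypothetical.

```python
def decode_stateful(messages, initial=0):
    """Each message is a change (+1 or -1) relative to the previous state."""
    state = initial
    for delta in messages:
        state += delta
    return state


def decode_stateless(messages, initial=0):
    """Each message carries the complete state (0 or 1)."""
    state = initial
    for value in messages:
        state = value
    return state


def drop(messages, index):
    """Simulate losing one message in transit."""
    return messages[:index] + messages[index + 1:]


stateful_sent = [+1, -1, +1, -1]   # press, release, press, release
stateless_sent = [1, 0, 1, 0]

print(decode_stateful(drop(stateful_sent, 1)))    # receiver is left with a wrong state
print(decode_stateless(drop(stateless_sent, 1)))  # receiver recovers on the next message
```

With the stateful encoding the receiver stays out of step after the loss unless extra recovery logic is added, whereas the stateless receiver is correct again as soon as the next message arrives.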

Stateless Summary


Stateful protocols are an optimization that 
reduces protocol bandwidth at the expense of 
protocol implementation complexity, 

- Especially when error recovery is involved 

- Otherwise, use TCP to ensure no errors 
(pushes complexity down to transport layer) 

Stateless interfaces can more readily support 
temporal constraints such as leases and 
expiration timestamps. 





Abstraction Layering 

- Effective strategy for 
management of complexity by 
encapsulation 

- Can have some non-trivial 
complications... 







Device Abstraction
        ^
Mapping Transforms
        ^
Signal Processing
        ^
Electronic Sensing
        ^
Physical Materials
        ^
User Action

Multi-Layer Operations

- Some transformation operations 
transcend the layer structure 

- Especially mapping 
transformations! 

Example: Radio Drum

Schloss/Mathews/Boie

Radio Drum Sensor

Radio Drum Mapping

- Raw sensor to position:

\[ y = \frac{a+b-c-d}{a+b+c+d} \]

- Norm of derivative wrt. a+b, c+d:

\[ \|dy\| = \sqrt{ \left( -\frac{a+b-c-d}{(a+b+c+d)^2} + \frac{1}{a+b+c+d} \right)^2 + \left( \frac{1}{a+b+c+d} + \frac{a+b-c-d}{(a+b+c+d)^2} \right)^2 } \]

Ideal Mapping

[Figure: ideal mapping; axis label c+d]

Noise Amplification

[Figure: noise amplification]

- ||dy|| -> infinity as (a+b,c+d) -> (0,0)
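
The noise amplification can be checked numerically. This Python sketch assumes the position mapping y = (a+b-c-d)/(a+b+c+d) and evaluates the norm of its derivative with respect to u = a+b and v = c+d. The function names are hypothetical.

```python
import math


def position(a, b, c, d):
    """Normalized position estimate from the four raw sensor signals."""
    return (a + b - c - d) / (a + b + c + d)


def derivative_norm(a, b, c, d):
    """Norm of dy with respect to u = a+b and v = c+d."""
    u, v = a + b, c + d
    total = u + v
    dy_du = 1.0 / total - (u - v) / total**2
    dy_dv = -1.0 / total - (u - v) / total**2
    return math.hypot(dy_du, dy_dv)


# The position is unchanged when all signals shrink together, but the
# sensitivity to sensor noise grows as the signals approach zero.
for scale in (1.0, 0.1, 0.01):
    signals = (scale, scale, scale, scale)
    print(scale, position(*signals), derivative_norm(*signals))
```

Scaling all four signals down by a factor of 100 raises the derivative norm by the same factor, which is why weak signals far from the sensor produce noisy position estimates.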

Layering Summary

- Applications should maintain representations of 
control streams at multiple layers simultaneously 
when possible 

- This will support operations that need data 
from multiple layers. 

Summary

- Clock synchronization tolerance depends on
temporal control information and audio application
needs (do the calculations, don't just ignore it).

- Ethernet AVB meets all the synchronization needs
for audio control data, as well as bandwidth
reservation. (500 nsec error, 2 msec delay)

- There are multiple effective strategies for describing
audio control interfaces (RPC, REST, OOP, RDF)

- Stateless protocols and multilayered representations
can improve program reliability, enable temporal
features, and increase flexibility.

The End

Your comments / feedback: 
andy@cnmat.berkeley.edu 

Appendix

(omitted slides) 

Synchronization

- Suppose there is a set of changes to be applied 
all at once or not at all. 

- Suppose there is a set of changes that should be 
committed by time T, after which the request is 
considered to be expired. 

- In OSC we use Bundles to express frames of 
temporally-synchronized data. 

- The quality of distributed synchronization is 
limited by the clock distribution error, which is a 
network layer or frame layer service. 

Framing

- OSC needs a transport that includes a 
framing structure (such as datagram 
messaging, UDP) 

- Any serial transport can be adapted to 
support framing with a frame encoding: 

- SLIP RFC1055: a byte-quoted encoding 
that is robust to interruption (with double- 
ended variant) 

- int32 length preamble: requires an assured 
serial transport (TCP) (see OSC 1.0) 
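
A minimal Python sketch of SLIP framing with the double-ended variant (an END byte both before and after each packet), using the special byte values defined in RFC 1055. The function names are hypothetical, and the decoder silently skips empty frames and passes through malformed escape sequences for brevity.

```python
END, ESC, ESC_END, ESC_ESC = 0xC0, 0xDB, 0xDC, 0xDD  # RFC 1055 byte values


def slip_encode(packet: bytes) -> bytes:
    """Frame one packet, escaping END and ESC bytes inside the payload."""
    out = bytearray([END])                 # leading END flushes line noise
    for byte in packet:
        if byte == END:
            out += bytes([ESC, ESC_END])
        elif byte == ESC:
            out += bytes([ESC, ESC_ESC])
        else:
            out.append(byte)
    out.append(END)
    return bytes(out)


def slip_decode(stream: bytes) -> list:
    """Split a byte stream into the packets it contains."""
    packets, current, escaped = [], bytearray(), False
    for byte in stream:
        if escaped:
            current.append(END if byte == ESC_END else ESC if byte == ESC_ESC else byte)
            escaped = False
        elif byte == ESC:
            escaped = True
        elif byte == END:
            if current:                    # ignore empty frames between ENDs
                packets.append(bytes(current))
                current = bytearray()
        else:
            current.append(byte)
    return packets


first = b"/a\x00\x00,\x00\x00\x00"
second = b"payload with \xc0 and \xdb bytes"
print(slip_decode(slip_encode(first) + slip_encode(second)))
```

Because frame boundaries are marked in-band, a receiver that joins mid-stream or loses bytes can resynchronize at the next END byte.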

Routing

- Implementations should expose as 
much detail as possible from the 
network routing layer, so that 
applications can make full use of 
routing capabilities of the transport. 

- e.g., bidirectional UDP 

- Reverse NAT traversal (OSCgroups) 

Bulk Transports

- File pointers and databases can be treated as 
classes of serial transport having a bulk-IO 
delay distribution model (OSC Stream DB). 

- On serial transports we use a SLIP
RFC1055 encoding to provide a datagram
framing around OSC

OSC Stream DB

[Figure labels: OSC Application; Real-time Interface; Commands Interface (/play, /filter, /index); Read Stream (#bundle query results); Write Stream (#bundle to record); Informational Access Control; Forward Synchronization Scheduler; Real-time Gesture Streams; OSC Stream DB]